Processing Ranked Queries with the Minimum Space
نویسندگان
چکیده
Practical applications often need to rank multi-variate records by assigning various priorities to different attributes. Consider a relation that stores students’ grades on two courses: database and algorithm. Student performance is evaluated by an “overall score” calculated as w1 ·gdb+w2 ·galg , where w1, w2 are two input “weights”, and gdb (galg) is the student grade on database (algorithm). A “top-k ranked query” retrieves the k students with the best scores according to specific w1 and w2. We focus on top-k queries whose k is bounded by a constant c, and present solutions that guarantee low worst-case query cost by using provably the minimum space. The core of our methods is a novel concept, “minimum covering subset”, which contains only the necessary data for ensuring correct answers for all queries. Any 2D ranked search, for example, can be processed in O(logB(m/B)+ c/B) I/Os using O(m/B) space, where m is the size of the minimum covering subset, and B the disk page capacity. Similar results are also derived for higher dimensionalities and approximate ranked retrieval.
منابع مشابه
بهبود الگوریتم انتخاب دید در پایگاه داده تحلیلی با استفاده از یافتن پرس وجوهای پرتکرار
A data warehouse is a source for storing historical data to support decision making. Usually analytic queries take much time. To solve response time problem it should be materialized some views to answer all queries in minimum response time. There are many solutions for view selection problems. The most appropriate solution for view selection is materializing frequent queries. Previously posed ...
متن کاملارائه روشی پویا جهت پاسخ به پرسوجوهای پیوسته تجمّعی اقتضایی
Data Streams are infinite, fast, time-stamp data elements which are received explosively. Generally, these elements need to be processed in an online, real-time way. So, algorithms to process data streams and answer queries on these streams are mostly one-pass. The execution of such algorithms has some challenges such as memory limitation, scheduling, and accuracy of answers. They will be more ...
متن کاملRelational Databases Query Optimization using Hybrid Evolutionary Algorithm
Optimizing the database queries is one of hard research problems. Exhaustive search techniques like dynamic programming is suitable for queries with a few relations, but by increasing the number of relations in query, much use of memory and processing is needed, and the use of these methods is not suitable, so we have to use random and evolutionary methods. The use of evolutionary methods, beca...
متن کاملمدل جدیدی برای جستجوی عبارت بر اساس کمینه جابهجایی وزندار
Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...
متن کاملBranch-and-bound processing of ranked queries
Despite the importance of ranked queries in numerous applications involving multi-criteria decision making, they are not efficiently supported by traditional database systems. In this paper, we propose a simple yet powerful technique for processing such queries based on multidimensional access methods and branch-and-bound search. The advantages of the proposed methodology are: (i) it is space e...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006